Preface

This is the third in a series of literature reviews on the problems
of skill generalization. The three components of this review
address issues which have developed out of our research and 
application activities rather than from any preconceived need for 
conceptual continuity within the literature review series. 

Liberty's review of research on self-control, self-monitoring, and
self-reinforcement came about as a direct result of her intense
interest in the topic. It may ultimately provide a very useful
strategy for facilitating generalization. The instructor's need to
have effective and cost efficient options for assessing
generalization prompted the Kayser and Billingsley review of
assessment procedures.

One of the greatest realizations that we have had in our
investigations has come as we entered the application phase. It
has become clear that the teachers and support staff involved
from the participating school districts need additional training in
order to apply the recommended intervention procedures. Hence
the interest of Lynch and McCarty in studying cost efficiency and
durability of training methods for staff development in terms of
maintenance of their teaching skills.

Some Trends Since 1977 

A summary by White, Leber, and Phifer (1985) of research
studies since around 1977 involving a total of 405 subjects shows
that substantially more studies targeted functional skills than in
the Stokes and Baer (1977) review. In addition, many more
studies were with handicapped subjects and were conducted in
natural settings, although there still remain a certain number of
studies involving skills which are not essential for functioning in
natural settings. Since 1977, of the 115 articles having to do with
generalization published in 11 journals, 71 (62%) involved
severely handicapped students. The two journals that reported the
greatest number of generalization studies were the Journal of
Applied Behavior Analysis (JABA) and the Journal of the
Association for Persons with Severe Handicaps (JASH). Forty-eight
percent of the articles on generalization which we reviewed
came from JABA and 13% came from JASH.

One of the tasks of this institute is to review all of the well
controlled and quantified investigations on generalization that
have been published since 1977 in terms of whether the critical
factors to generalization are likely to contribute to or impede
progress toward skill generalization. From analyses of those data,
factors and consequating effects have been identified (see the
following table).

Access to multiple settings and/or different teachers and students
during the school day does not appear to contribute to skill
generalization, unless those variables are identified as part of a
particular instructional strategy. Simply providing instruction in a
natural setting also does not promote generalization (unless there
is only one primary environment); strategies which are designed
to improve generalization must be incorporated into instruction.
The use of generalization strategies by teachers appears to be
more important than the site of instruction in contributing to skill
generalization.

Strategies which have been tentatively identified as facilitating
generalization include (not necessarily in order of their
effectiveness): program natural reinforcers; fade training
reinforcers; use natural schedules; use natural consequences; teach
self-reinforcement; teach to solicit reinforcement; reinforce
generalized behavior; alter contingencies in the generalization
situations; vary stimuli using common stimuli, multiple
exemplars, or general case approach; increase skill proficiency;
fade training stimuli; train in the generalization situation on site;
and expand the target skill to increase its function in critical
situations.

Factors Contributing to or Impeding Skill Generalization

Factor                                     Contributing                        Impeding

Type of skill instructed                   Functional                          Not functional
Usefulness in other situations             Useful in many                      Useful in one
IEP criteria                               Specifies generalization            Does not specify generalization
Level of skill mastery                     At or near aim                      Acquisition levels
Level of skill fluency                     Proficient                          Slower than environmental demands
Opportunity to use in other situations     Often                               Seldom
Type of instruction                        Use strategies for generalization   No strategies for generalization
Consequences in generalization situation   Reinforced for target skill         Not reinforced for target skill
Parents train at home                      Happens                             Does not happen
Competing behaviors                        Controlled or not present           Present

Characteristics of pupil performance which appear to contribute to
or impede skill generalization include: level of performance in
instruction at time generalization is assessed; relative fluency of
target skill and competing skills; type of errors in generalization
assessments.

Behaver Control of Stimulus Events

The methods we use in teaching skill acquisition play a critical
role in whether or not the SDs which cue targeted responses
facilitate or impede generalization. As an example, because a
specific trainer has become an SD for the response, his absence
from the nontraining setting may result in a lack of appropriate
responding. If on the other hand the reinforcement schedules or
reinforcers included self-control strategies, the student can play an
active role in the mediation of those differences. In a real sense
he acts as his own trainer. Since we can't predict what variations
future environments will hold, building strategies for the student
to use in self-control and decision-making could be an important
phase of training.

Assessing Generalization 

One area of behavior change in the literature that has been
addressed by this report is the review of procedures which are
being used to assess generalization. There were a total of 48
articles on assessment of generalization from five journals
published from 1980-1985. For comparison we reviewed 14
articles published from 1970-1975. There were 3.4 times more
articles which reported data on the assessment of generalization
from 1980-1985 than from 1970-1975. This marks a significant
increase in the interest of researchers in assessing the
generalization of acquired skills. From our review it seems clear
that the assessment of generalization, while very time consuming,
is important because without the examination of generalization
across relevant dimensions in the natural environment, the
findings may have limited educational value.

Staff Development

One of the findings of our review is that the sophistication of
research on generalization has increased greatly, in particular the
research conducted with the severely handicapped.

In applying research findings from the preceding years, we have
seen clearly the level of competency required of teachers to apply
the processes and procedures that are necessary to ensure that
students will generalize skills across settings, across people, and
across stimuli. In fact, only two of the teachers involved in the
application phase of this study were capable of employing the
intervention procedures without extensive inservice training. This
observation prompted Lynch and McCarty to conduct a review of
the literature on staff development and inservice training.

Our concern with application and replication of the findings in
school settings stimulated these questions: Can the new
procedures for enhancing generalization be used by public school
teachers of the severely handicapped as effectively as the project
staff? Can the new procedures produce results similar to the
original research findings? Can the new procedures be practical
and cost effective enough to ensure widespread application in
school districts? In the case of our findings, which involve a
long list of strategies found to facilitate generalization, teachers
cannot readily determine what strategy to use with which
performance problem. Even though a set of decision rules has
been developed from the research in this project, in the
"application" phase, teachers needed a great deal of assistance in
following the rules.

As a result of the application studies and the literature reviews,
we have gained valuable information about the staff preparation
that should precede the design and implementation of application
studies. We probably should have titled this section "I wish I
didn't know now what I didn't know then." In any event, the
majority of teachers who are currently teaching the severely
handicapped require significantly more training in order to apply
strategies which are known to enhance generalization. In addition,
the process of deciding which strategy to use in what particular
circumstance does involve using rules developed to guide teachers
in making that decision with more reliable results. We have seen
that teachers can employ these decision rules with specific,
systematic, and intensive inservice training.



Norris G. Haring 
Principal Investigator 
Seattle, 1987 

Behaver-Control of Stimulus Events
to Facilitate Generalization



Kathleen A. Liberty 



Most of the research in the area of stimulus control with severely
handicapped people has been directed at the first instances of a
response, and how we can manipulate antecedent and consequent
events to develop predictable relationships between events and
behavior. Research is accumulating which testifies to our success
in manipulating stimulus events to promote the acquisition of a
broad range of skills by severely handicapped persons, persons
generally considered "unteachable" two decades ago.

Our success has brought us new challenges. The very strategies 
which we use to promote acquisition may interfere with 
generalization. By using verbal prompts, we may be making it 
difficult for the student to act when there are no prompts. We can 
avoid this by fading the prompts, models, demonstrations, and 
cues we use in instruction. Our use of high density reinforcement 
during acquisition may also impede generalization. We can
gradually reduce our schedule of reinforcement, and also 
eliminate reinforcers which don't occur in other settings. We 
should introduce a broad range of stimulus events into training as 
well, since providing only a few exemplars also causes problems. 
Research into these and other strategies has been the focus of 
many of our efforts at solving problems in generalization. 

Most of the generalization strategies suggested so far have 
involved changing how stimulus events are manipulated or 
presented to the student. An alternative is to change the 
controller. Instead of control by trainers or teachers, the behaver 
is taught to control events that may influence generalization. The 
shift is reflected in the term "self-control." Awkward as this is to
use (because when one speaks of "self"-control one is not actually
referring to oneself), it also carries some cognitive and
connotative baggage; so the term self-management has come to be
used as well. In either case, when that term is used, we are
identifying the behaver as the manipulator of antecedent and/or
consequent events which may (or may not) have a functional
effect on her/his own responding.

Although many people have touted the "promise" of behaver-control
of stimuli, very few research studies have actually
investigated generalized responding by behavers who have been
taught to control stimulus events. This review analyzes the results
of 15 investigations in order to determine how teaching self-control
affected students' performance in training and
generalization.

We first analyzed seven studies¹ involving 9 subjects. In these
studies, the purpose of teaching self-control was to influence
behavior directly in the training setting. We also analyzed eight
studies² involving 16 subjects, in which self-control was taught in
order to influence generalization.

In each study, the overall impact of the intervention was
calculated by dividing the net effect by the median baseline
variability. Net effects of teaching self-control were
calculated by comparing actual performance at the conclusion of
self-management training with performance predicted if baseline
conditions had continued during that period, according to the
formula: larger divided by smaller (Kazdin, 1976; White, 1971a,
1974; see Figure 1-1). The net effect encompasses changes in
both level and trend, and provides a measure of the magnitude of
the average effect of the intervention.

The relative variability of performance must also be considered in 
estimating impact in order to eliminate effects which are actually 
encompassed by normal variability in performance. In this 
review, performance variability was calculated for each value in 
baseline, and the median variability was used to represent the 
average amount of change predicted by current performance 
variability. 

Figure 1-1

How net effects, variability, and overall impact were assessed

[Graph; labels: Net Effect, Time]

A = Performance predicted by split-middle trend (Kazdin,
    1976; White, 1971b, 1972, 1974) if baseline
    performance had continued without intervention to
    time at which actual self-control phase ended.

B = Performance at end of self-control phase, calculated
    at end of split-middle trend.

C = Actual performance.

D = Performance summarized by split-middle trend.

Net Effect = Divide larger of A and B values by the smaller and
    determine direction of effect.
Variability = Divide larger of C and D values by the smaller for
    each performance value in baseline.
Overall Impact = Net effect divided by median baseline variability.



When the magnitude of the net effect is smaller than the
magnitude of the daily bounce in baseline, the magnitude of the
overall impact is less than 1.0. In these cases, the amount of
change during intervention is within the student's normal
behavior range prior to intervention (Figure 1-2), and thus the
overall impact is probably insignificant. Overall impact was
calculated by dividing the net effect, representing changes in both
level and trend of performance, by the median baseline
variability, representing the relative amount of change predicted
prior to intervention as part of the student's normal performance.
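
The calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the studies reviewed: the function names are hypothetical, the split-middle trend values are assumed to be supplied by the caller rather than computed here, the baseline data are invented, and all values are assumed to be positive.

```python
from statistics import median


def ratio(a, b):
    """Divide the larger of two positive values by the smaller (result >= 1.0)."""
    smaller, larger = sorted((a, b))
    if smaller <= 0:
        raise ValueError("both values must be positive")
    return larger / smaller


def net_effect(predicted_a, observed_b):
    """A = performance predicted from baseline; B = performance at end of phase."""
    return ratio(predicted_a, observed_b)


def median_baseline_variability(actual_c, trend_d):
    """Median of the larger/smaller ratios between each baseline value (C)
    and the trend value for the same session (D)."""
    return median(ratio(c, d) for c, d in zip(actual_c, trend_d))


def overall_impact(net, variability):
    """Net effect divided by median baseline variability."""
    return net / variability


# Hypothetical baseline: observed values and assumed trend-line values.
baseline_actual = [4.0, 5.0, 6.0, 5.0, 4.0]
baseline_trend = [5.0, 5.0, 5.0, 5.0, 5.0]

variability = median_baseline_variability(baseline_actual, baseline_trend)
net = net_effect(predicted_a=5.0, observed_b=20.0)
impact = overall_impact(net, variability)
print(variability, net, round(impact, 1))
```

The direction of the effect (improved or worsened) is not computed in this sketch; it would have to be judged separately from whether B lies on the desired side of A. A quotient below 1.0, for example `overall_impact(1.1, 1.4)`, corresponds to a change that falls inside the baseline bounce.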

Table 1-1 lists the net effect, median baseline variability, and the
overall impact on performance in training situations of teaching
subjects self-control strategies. Of the 34 performances analyzed,
one declined, eight of the changes were within the subject's
normal variability of performance, and 25 performances
improved. By response class, inappropriate behavior showed the
greatest impact; however, expressive communication was the only
category in which everyone's performance improved, or showed
no change.

Training performance improved for 73.5% of all subjects as a
result of self-control training (Figure 1-3). However, only 50% of
the severely handicapped subjects' performances improved, as
compared to 90% of the other subjects. The magnitude of
improvement ranged from 1.1 to 32 times greater than what was
predicted from baseline levels.

For 20 of the 34 cases, performance was also assessed in
nontraining situations. Generalization to untrained instances,
untrained settings, and across time and untrained subjects was
included in this sample. Table 1-2 lists the overall impact on
performance in generalization. In 16 of 20 instances (80%),
generalization improved as a result of self-control training (Figure
1-4). Improvement was shown for 91% of the severely
handicapped subjects and 67% of other subjects, just about
reversing the proportions whose training performances improved.

Figure 1-2
Insignificant Impact

[Graph; label: Net Effect]

A = Performance predicted by split-middle trend (Kazdin,
    1976; White, 1971b, 1972, 1974) if baseline performance
    had continued without intervention to time at which
    actual self-control phase ended.
B = Performance at end of self-control phase, calculated at
    end of split-middle trend.
C1 = Most variable point.
C2 = Least variable point.
C3 = Middle variable point (i.e., median variability = 1.4).
D = Trend in baseline.

Figure 1-2 shows performance data where net effect is less than
median variability during baseline, and thus overall impact is less
than 1.0, and not significant.



Table 1-1

Impact of teaching self-control

                                                                                      Net          Median Baseline  Overall
Article                                                 Subject                       Effect       Variability      Impact

Work Rate
 1. Bates, Renzaglia, & Clees (1980)⁴                   Subject 1¹                    (1.1)²       [illegible]      1.0
 2. Horner, Lahren, Schwartz, O'Neill, & Hunter (1979)  "Phil"                        [illegible]  [illegible]      1.9
 3. Jackson & Martin (in press)                         Subject 1                     1.1          1.0              1.1
 4. Jackson & Martin (in press)                         Subject 2                     1.4          1.0              1.4
 5. Jackson & Martin (in press)                         Subject 3                     1.6          1.1              1.5

On Task
 6. Burgio, Whitman, & Johnson (1980)                   "Judy" (math)                 1.0          1.1              1.0
 7. Burgio, Whitman, & Johnson (1980)                   "Judy" (phonics)              9.0          1.3              6.8
 8. Burgio, Whitman, & Johnson (1980)                   "Judy" (printing)             (6.0)        1.2              5.0
 9. Fantuzzo, Harrell, & McLeod (1979)                  "Ron"                         (1.1)        1.4              1.0
10. Burgio, Whitman, & Johnson (1980)                   "Angie" (math)                10.0         1.4              7.1
11. Burgio, Whitman, & Johnson (1980)                   "Angie" (phonics)             5.0          1.5              3.3
12. Burgio, Whitman, & Johnson (1980)                   "Angie" (printing)            3.5          1.2              2.9
13. Horner & Brigham (1979)                             Subject A                     20.0         1.4              14.3
14. Horner & Brigham (1979)                             Subject B                     9.0          1.1              8.2

Inappropriate Behavior
15. Gardner, Clees, & Cole (1983)                       Subject 1 (talks to self)     10.0         1.0              10.0
16. Gardner, Clees, & Cole (1983)                       Subject 1 (talks to others)   7.0          1.0              7.0
17. Gardner, Cole, Berry, & Nowinski (1983)             "Roger"                       28.0         1.3              21.5
18. Gardner, Cole, Berry, & Nowinski (1983)             "Sue"                         40.0         1.6              25.0
19. Rosine & Martin (in press)                          "A"                           1.0          1.2              1.0
20. Rosine & Martin (in press)                          "B"                           (5.0)        3.1              (1.6)
21. Rosine & Martin (in press)                          "C"                           15.0         1.5              10.0
22. Ollendick (1981)                                    "David"                       35.0         1.1              32.0

Expressive Communication
23. Liberty (1984a)                                     [illegible]                   1.6          1.5              1.1
24. Liberty (1984a)                                     [illegible]                   1.3          1.0              1.3
25. Liberty (1984a)                                     "Lisa"                        1.0          1.2              1.0
26. Liberty (1984b)                                     "Joe"                         1.0          1.2              1.0
27. Liberty (1984b)                                     "Sam"                         1.0          1.0              1.0
28. Liberty (1983)                                      "Shelly"                      1.0          1.2              1.0
29. Harris & Graham (in press)                          "Rachel" (action words)       1.4          1.3              1.1
30. Harris & Graham (in press)                          "Rachel" (action helpers)     13.0         1.0              13.0
31. Harris & Graham (in press)                          "Rachel" (describing words)   4.0          1.1              3.6
32. Harris & Graham (in press)                          "Jim" (action words)          1.4          1.3              1.1
33. Harris & Graham (in press)                          "Jim" (action helpers)        7.0          1.0              7.0
34. Harris & Graham (in press)                          "Jim" (describing words)      2.3          1.3              1.8

Notes

1. Italicized subjects are severely or profoundly handicapped persons.

2. Values in parentheses indicate that performance worsened.
Values without parentheses indicate that performance improved.

3. The higher the value the greater the bounce.

4. Effects of self-administered reinforcement only (excludes effects of
changing criterion for reinforcing work rate).

Figure 1-3

Impact of self-control on training performance

[Charts: All Subjects (N = 34); Severely/Profoundly Handicapped (N = 14)]

Figure 1-4

Impact of self-control on generalization performance

[Charts: All Subjects; Severely/Profoundly Handicapped]

Legend for Figures 1-3 and 1-4: No change; Change less than median
variability; Improved; Worse.


Table 1-2

Impact of self-control on generalization

                                                                            Overall Impact
Article                                       Subject                       Training²    Generalization  Generalization Dimension

Untrained Instances
 1. Liberty (1984a)                           [illegible]¹                  1.1          2.8             across untrained instances
 2. Liberty (1984a)                           [illegible]                   1.3          1.0             across untrained instances
 3. Liberty (1984a)                           "Lisa"                        1.0          1.0             across untrained instances
 4. Liberty (1984b)                           "Joe"                         1.0          1.6             across untrained instances
 5. Liberty (1984b)                           "Sam"                         1.0          1.5             across untrained instances
 6. Liberty (1983)                            "Shelly"                      1.0          1.3             across untrained instances

Untrained Settings
 7. Rosine & Martin (in press)                "A"                           1.0          15.2            across settings
 8. Rosine & Martin (in press)                "B"                           (1.6)        10              across settings
 9. Rosine & Martin (in press)                "C"                           10.0         4.6             across settings
10. Ollendick (1981)                          "David"                       31.2         31.8            across settings

Across Time
11. Harris & Graham (in press)                "Rachel" (action words)       1.1          1.0             over time
12. Harris & Graham (in press)                "Rachel" (action helpers)     13.0         15.0            over time
13. Harris & Graham (in press)                "Rachel" (describing words)   3.6          3.8             over time
14. Harris & Graham (in press)                "Jim" (action words)          1.1          1.0             over time
15. Harris & Graham (in press)                "Jim" (action helpers)        7.0          10.0            over time
16. Harris & Graham (in press)                "Jim" (describing words)      1.8          (1.3)           over time
17. Ollendick (1981)                          "David"                       31.2         31.8            over time (also over settings)
18. Gardner, Cole, Berry, & Nowinski (1983)   "Roger"                       21.5         21.5            over time
19. Gardner, Cole, Berry, & Nowinski (1983)   "Sue"                         25.0         25.0            over time

Untrained Subjects
20. Fantuzzo, Harrell, & McLeod (1979)        "Ron"                         [illegible]  2.1             across untrained subject

Notes

1. Italicized subjects are severely or profoundly handicapped persons.

2. The figures in this column are repeated from Table 1-1. Values in
parentheses indicate that performance worsened. Values without
parentheses indicate that performance improved.

Not only did self-control instruction improve a greater proportion
of generalized performance as compared with training, the
magnitude of improvement was greater as well. The median
improvement for training was 1.1x while the median for
generalization was 2.8x greater than predicted without self-control
training.

These results are only suggestive, since data from only a few
subjects are available. However, they do suggest that self-control
may be effective in facilitating generalization, especially with
severely handicapped subjects. These results suggest teaching
self-control in order to promote generalization, rather than solely
to improve classroom performance.

Generalization may fail to occur for many reasons. For example,
it may fail because a specific trainer has become an SD for the
response and he is not present in the new situation, or because
reinforcement schedules differ across settings. Self-control
strategies permit the student to mediate these differences, to act
as his own trainer. We can't predict what future environments will
bring, so it will require much effort to devise strategies for
teachers that will produce generalization to all situations.

We can then require teachers to implement all of these strategies.
Self-control offers an alternative to this, too. Instead of trying to
predict future situations in which a particular behavior may or
may not occur, we should look at who is behaving. The behavior
can't occur without a behaver. We could train the behaver in self-control
skills so that she, in turn, could cope with future situations
which differ from training rather than continuing dependence on
trainers to determine environmental events (Dweck, 1975). This
is the implicit, and exciting, premise of teaching self-control
skills: functional independence.

Footnotes

¹Studies were those included in an earlier review (see Liberty &
Michael, 1985, for criteria for inclusion).

²Studies met the following criteria: training and generalization data
presented for individual subjects; repeated measures of generalization across
phases. These included three studies previously included in UWRO
monographs.

Generalization: A Review of
Assessment Procedures



Joan Kayser 
Felix F. Billingsley 

The demonstration of generalized treatment effects is becoming a 
prominent feature in the behavior change literature related to 
learners with moderate and severe handicaps. However, while 
there have been advances in methods designed to facilitate 
generalization (see Stokes & Baer, 1977; Horner, Sprague, &
Wilcox, 1982; Liberty, 1985; Warren, Rogers-Warren, Baer, & 
Guess, 1980), the assessment practices which characterize the 
investigation of generalization have rarely been subjected to 
examination (e.g., Kendall, 1981). 

In order to identify the procedures currently utilized to assess
generalization, all studies which examined generalized
performance by learners with moderate to profound handicaps
were reviewed in a sample of the experimental literature for a
5-year period, 1980-1985. Forty-eight articles from the following
journals were included in the review: Journal of Applied Behavior
Analysis, Journal of Autism and Developmental Disorders
(formerly Journal of Autism and Childhood Schizophrenia),
Analysis and Intervention in Developmental Disabilities (initial
publication, 1981), and Journal of the Association for Persons
with Severe Handicaps (formerly Journal of the Association for
the Severely Handicapped and AAESPH Review, initial
publication, 1976). In addition, to provide some degree of
historical perspective, 14 applicable articles were located in the
Journal of Applied Behavior Analysis for 1970-1975 and the
Journal of Autism and Childhood Schizophrenia for 1971-1975
(initial publication, 1971). A bibliography of all the studies
surveyed is provided at the end of this review.




For purposes of the review, generalization was defined as "the
performance of (previously learned) skills in (previously
untaught) new situations -- in other school or nonschool settings,
with other cues or stimuli, with other individuals, and so on"
(Liberty, Haring, & Martin, 1981). "Learners with moderate to
profound handicaps" refers to individuals classified as moderately
to profoundly retarded, autistic, or multiply handicapped with at
least moderate retardation.

Assessment procedures were examined to answer questions in 
three areas. First, across what dimensions is generalization 
assessed; that is, is generalization commonly measured across 
settings, across people, across stimuli, or across a combination of 
those dimensions? Second, who assesses the learner's 
performance in nontraining situations, and under what conditions 
are generalization observations made? Third, when and how often 
is generalization assessed? 



What to Assess

The assessment of generalization has been reported:

1. Across settings (Agosta, Close, Hops, & Rusch, 1980; Dy,
Strain, Fullerton, & Stowitschek, 1981; Eason, White, &
Newsom, 1982; Hill, Wehman, & Horst, 1982; and McGee,
Krantz, Mason, & McClannahan, 1983).

2. Across people only (Brady et al., 1984; Breen, Haring,
Pitts-Conway, & Gaylord-Ross, 1985; Sternberg, Pegnatore,
& Hill, 1983; and Wacker & Berg, 1984a).

3. Across settings and people (Handleman & Harris, 1980;
Oliver & Halle, 1982; Strain, 1983; and Warren & Rogers-Warren,
1983).

4. Across stimuli only (Anderson & Spradlin, 1980; Egel,
Shafer, & Neef, 1984; Horner & McDonald, 1982; and
Hupp & Mervis, 1981).

5. Across settings and stimuli (Coon, Vogelsburg, & Williams,
1981; Schleien, Ash, Kiernan, & Wehman, 1981; and
Wacker & Berg, 1984b).

6. Across people and stimuli (Richman, Reiss, Bauman, &
Bailey, 1984).

7. Across settings, people, and stimuli (Kleinert & Gast, 1982;
Krantz, Zalenski, Hall, Fenske, & McClannahan, 1981;
McDonnell, Horner, & Williams, 1984; and van den Pol et
al., 1981).

According to the studies surveyed for this review, the most 
dramatic change in the type of generalization assessed over the 
past 10 years has been an increase in the measurement of skill 
generalization across settings. Table 2-1 shows the number of 
articles for each 5-year period in which skills were measured 
across each dimension of generalization. In 1970-1975, only 4 of 
14 studies (29%) measured generalization across settings, across 
settings and people, or across settings and stimuli. As indicated in 
Table 2-1, the most frequently measured dimension of 
generalization for that time period was across stimuli. For
example, Garcia, Guess, and Byrnes (1973) evaluated the
generalization of subjects' use of singular and plural declarative
sentences to novel items presented by the trainer. In a similar
study (Sailor, 1971), generalization of two retarded females' use
of plurals in their language was measured across stimuli. In
1980-1985, 33 of 48 investigations (69%) included cross-setting
performance in their measurement of generalization. By way of
illustration, Tucker and Berry (1980) assessed severely
multihandicapped students' ability to put on hearing aids in their
residential living units. In a study by Snell (1982), four males 
with severe mental retardation were taught how to make beds in 
their classrooms and were assessed for generalization in their 
dormitories. 

Table 2-1

Number of Articles Reporting Generalization
Across Each Dimension

Generalization Dimension                      # Studies
                                        1970-1975   1980-1985

Across setting only                          1          11
Across people only                           3           4
Across stimuli only                          6          10
Across setting and people                    2          12
Across setting and stimuli                   1           6
Across people and stimuli                    1           1
Across setting, people, and stimuli          0           4

Total =                                     14          48
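
The cross-setting percentages quoted in the text (4 of 14 and 33 of 48) can be checked directly against the counts in Table 2-1. The short Python sketch below is only an arithmetic check; the dictionary layout and the function name are illustrative choices and are not part of the review.

```python
# Counts transcribed from Table 2-1: (1970-1975, 1980-1985).
table_2_1 = {
    "setting only": (1, 11),
    "people only": (3, 4),
    "stimuli only": (6, 10),
    "setting and people": (2, 12),
    "setting and stimuli": (1, 6),
    "people and stimuli": (1, 1),
    "setting, people, and stimuli": (0, 4),
}


def cross_setting_share(period_index):
    """Return (studies that include settings, all studies, rounded percent)."""
    total = sum(counts[period_index] for counts in table_2_1.values())
    with_setting = sum(
        counts[period_index]
        for dimension, counts in table_2_1.items()
        if "setting" in dimension
    )
    return with_setting, total, round(100 * with_setting / total)


print(cross_setting_share(0))  # 1970-1975 -> (4, 14, 29)
print(cross_setting_share(1))  # 1980-1985 -> (33, 48, 69)
```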



Clearly, researchers have become more concerned with ensuring 
that newly acquired skills are performed outside of training 
settings. Although not the focus of this review, a comparison of 
the types of experimental tasks utilized in these studies suggests 
that the tasks taught from 1980-1985 may have been more 
conducive to the measurement of cross-setting generalization. In 
the 1970-1975 literature sample, 11 of 14 (79%) of the 
experimental tasks were communication-related skills; 2 of 14 
(14%) involved decreasing inappropriate behaviors; and, in one
study, social interaction skills were instructed. In 1980-1985, the
types of behaviors taught were more diverse. Nineteen of 48
(40%) of the experimental tasks were communication-related
skills; 9 of 48 (19%) were self-help skills; 7 of 48 (15%) were
leisure skills; 6 of 48 (13%) were social interaction skills; 4 of 48
(8%) were vocational skills; and 3 of 48 (6%) involved
decreasing inappropriate behaviors. Even though the highest
percentage of experimental tasks taught during each time period
involved communication skills, cross-setting generalization was
assessed in only 3 of 11 (27%) of the studies in which such skills
were taught from 1970-1975, whereas in 1980-1985, 11 of 19
(58%) of the communication skills taught were assessed in
nontraining settings. It may be that the increase in the percentage
of communication skills which were measured across settings
reflects an increase in the number of investigations which focused
on functional skills (i.e., skills which might naturally be required
in a number of settings rather than experimentally convenient or
"developmental" communication skills).

It is interesting that so few studies reported students' performance
of skills in nontraining settings, with people not involved in
instruction, and with stimuli different from those used in training
sessions. In fact, none of the studies in the 1970-1975 sample,
and only 4 of 48 (8%) 1980-1985 studies, evaluated
generalization across all three dimensions. In a study by Krantz et
al. (1981), generalization of verbal descriptors by children with
autism was assessed separately and sequentially across new
materials, with a teacher not involved in training, and in a
nontraining setting. However, van den Pol et al. (1981) assessed
moderately retarded students' generalization of restaurant skills
simultaneously across settings, stimuli, and people.

The most comprehensive assessment of generalization would be
the systematic evaluation of skill performance across each
dimension separately (i.e., setting, people, and stimuli), as well as
across all three dimensions simultaneously. This could provide
information which might indicate possible reasons for
unsuccessful generalization or identify otherwise misleading
instances of cross-situational responding. For example, in the
latter case, if a student's newly learned behavior is controlled by
the actions of a specific trainer, and the skill is assessed by that
trainer under nontraining setting and stimulus conditions, the data
might suggest that the student has successfully generalized the
skill. It is possible, however, that the student would not use the
skill in situations where the trainer was not present. The
systematic evaluation of generalization across the various
dimensions might be accomplished by evaluating the student's
performance in other settings, but with the same person and
stimuli involved during training; with a person who was not
present during training sessions, but in the same setting and with
the same stimuli used in training; and with different materials, but
with the original trainer present in the training setting. Where
generalization does not occur across a particular dimension, then
specific generalization facilitation techniques (e.g., general case
instruction, programming natural maintaining contingencies, or
training loosely) might subsequently be applied. If assessments of
generalization are conducted simultaneously across all three
dimensions following a systematic analysis related to each
condition, it seems likely that the simultaneous assessments will
provide a more informative account of generalized performance.

The experimental literature seems to indicate an increasing interest 
in the use of "functional" tasks in research involving persons with 
handicaps. Where learners are instructed in the performance of
such tasks, consistency with the concepts of social and
educational validity requires that generalization assessments be
conducted within situations in which trained skills would
naturally be performed. To illustrate, if toothbrushing is taught in
a school environment by a student's teacher, then generalization
should be assessed in the student's home, when s/he would
naturally engage in the activity, in the presence of the parent or
caregiver, and with the materials commonly available in that
setting.



In our literature sample, the people directly involved in the 
assessment of generalization included trainers, adults unfamiliar to 
the learner, and familiar adults and peers not involved in training 
acting as solicitors and/or observers of the target behavior. In 2 of 
14 (14%) of the studies surveyed from 1970-1975, and in 7 of 48 
(15%) of the studies surveyed from 1980-1985, spontaneous 
performance of the target behavior by the student was observed in 
the generalization setting (i.e., no solicitors were present). For
example, in two studies (Foxx, McMorrow, & Mennemeier, 1984;
Gaylord-Ross, Haring, Breen, & Pitts-Conway, 1984), subjects
were instructed in social interaction skills and unsolicited
interaction with peers was measured in the generalization settings.
In the majority of studies, however, the children were taught
behaviors which were evoked by cues from other individuals.
The impact of the presence of trainers in the generalization setting
is unknown. In 8 of 14 (57%) of the studies reviewed from
1970-1975, and in 27 of 48 (56%) of the studies reviewed for
1980-1985, the trainer solicited the student's behavior when
generalization was assessed. In 1970-1975 and 1980-1985, 6 of
14 (43%) and 22 of 48 (46%) of the studies, respectively,
reported that the trainers also were responsible for collecting data
on subject responses during the generalization probes. For
example, in a study by McGee, Krantz, Mason, & McClannahan
(1983), receptive labeling of lunch preparation items was solicited
and responses recorded during generalization probes by the
subjects' house parent, who also provided instruction during the
training phase. In another study in which two autistic children
were taught "yes" and "no" mands in response to desirable and
undesirable food items, subject responses to untrained stimuli
were solicited and measured by the original trainers (Hung,
1980). There is some evidence that the implementation of a
recording procedure by a trainer may have an effect on the
behavior of an "observee" and may result in biased data (Hay,
Nelson, & Hay, 1980). In addition, when trainers are present
during generalization probes, the studies fail to show that the
learner would perform the newly acquired skill in the presence of
anyone other than the trainer.

The use of a person unfamiliar to the learner to solicit the target
response was reported in only one study from the 1970-1975
sample, and in three studies from the 1980-1985 sample. The
presence of unfamiliar observers in the generalization setting was
reported in one study from 1970-1975, and in 12 studies from
1980-1985. It is possible that the presence and/or participation of
a stranger in the generalization setting may have an impact on the
student's performance. For example, the presence of someone
unfamiliar to a child in his/her home environment during a
generalization probe could potentially distract the child from task
requirements. To illustrate, three autistic children's responses to
verbal questions were recorded by unfamiliar graduate students in
the children's homes in a study by Handleman and Harris (1980).
It is possible that the students' highly variable generalization
scores may have been due to the presence of strangers in their
homes.

A related issue concerns the conditions under which data are
collected during generalization probes. In the 1970-1975
literature sample, 10 of 14 articles (71%) reported overt
observation, 2 of 14 (14%) reported covert observation, and 2 of
14 (14%) reported both overt and covert observation in the
generalization setting. In the 1980-1985 sample, generalization
data were collected overtly in 37 of 48 studies (77%), covertly in
3 of 48 studies (6%), and both overtly and covertly in 5 of 48
studies (11%). In 3 of 48 (6%) of the 1980-1985 articles, the
manner in which observations were made could not be
determined.

Ideally, generalization should be assessed by whoever would
naturally be present in the generalization setting. If the learner is
expected to perform the skill spontaneously in the generalization
setting, then behavioral observations should be conducted
covertly, or perhaps overtly and as unobtrusively as possible by
someone likely to be present in that situation. For example, in a
study by Richman, Reiss, Bauman, and Bailey (1984),
institutionalized, mentally retarded females' performance of
menstrual care skills was measured by a ward staff member who
pretended to be busy "straightening up" in the bathroom.

Frequently children require verbal or physical cues from others in
order to perform a newly learned skill. In instances where the
behavior is solicited in the generalization setting, the person who
would naturally provide cues should solicit the child's behavior.
To illustrate, the generalization of manual sign use by profoundly
retarded adults in response to verbal instructions was solicited by
the teacher in the classroom setting and by a staff member on the
ward in a study by Duker and Morsink (1984). When behavior is
solicited, the preferable method of collecting generalization data
would be covert observation. However, in situations where covert
observation is impossible or impractical, data collection by the
solicitor may be preferable to overt observation by the trainer or
an unfamiliar adult.

When to Assess



Guidelines for determining when and how often to assess for 
generalization have not been provided in the literature. The
articles surveyed for this review indicated that the number of
generalization probes ranged from a single, post-training probe
session (Eason, White, & Newsom, 1982; Kleinert & Gast, 1982)
to multiple probe sessions conducted before, during, and after
training (Agosta, Close, Hops, & Rusch, 1980; Brady et al., 1984;
Coon, Vogelsberg, & Williams, 1981; Foxx, McMorrow, &
Mennemeier, 1984; Stainback, Stainback, Wehman, & Spangiers,
1983; Storey, Bates, & Hanson, 1984; van den Pol et al., 1981).
The scheduling of probes in relation to the training phase was
diverse for both 5-year periods. As can be seen in Table 2-2, the
most dramatic change over the past decade was the relatively
large increase in generalization probes conducted after training
only (7% for 1970-1975 versus 25% for 1980-1985). It is
surprising that generalization probes conducted after training only
were as frequently reported as probes conducted before, during,
and after training during the 1980-1985 period. The problem with
assessing cross-situational performance only after training has
occurred is, of course, that no evidence is available to indicate
that the learner did not perform the skill in the generalization
situation(s) prior to training. In the absence of such evidence,
generalization cannot legitimately be claimed.

Table 2-2

Generalization Probe Scheduling

Scheduling of Probes                         # Studies
                                       1970-1975   1980-1985

Before, during, and after training          3          12
Before and after training                   1           7
Before and during training                  4          11
During training only                        3           2
After training only                         1          12
During and after training                   2           1
Not reported                                0           3

Total =                                    14          48

In the 1980-1985 literature sample, wide variation was noted in
the number of generalization probes conducted before, during,
and following training. Many investigations contained
information that would permit the number of probes employed to
be determined only as a range. Therefore, probe ranges were used
as a basis for analysis. Where a precise number of probes could
be determined, that number was included as both the low and
high end of the range. The range across studies for the number of
probes conducted before, during, and after training, and the
medians for the low and high scores of the ranges, are presented
in Table 2-3.
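
The range-based summary described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' actual procedure: the function names as_range and summarize are hypothetical, the range across studies is assumed to run from the smallest low value to the largest high value, and the probe counts are invented for the example rather than taken from the reviewed studies.

```python
from statistics import median


def as_range(probes):
    """Convert one study's report into a (low, high) pair.

    An exact count n is entered as both ends of the range, (n, n);
    a reported (low, high) range is kept as given.
    """
    if isinstance(probes, int):
        return (probes, probes)
    low, high = probes
    return (low, high)


def summarize(reported):
    """Summarize probe counts across studies for one training phase."""
    ranges = [as_range(p) for p in reported]
    lows = [low for low, _ in ranges]
    highs = [high for _, high in ranges]
    return {
        # Assumed definition: smallest low end to largest high end.
        "range": (min(lows), max(highs)),
        # Medians of the low ends and of the high ends, respectively.
        "medians": (median(lows), median(highs)),
    }


# Hypothetical reports from five studies (not data from the review):
# exact counts are ints, ranges are (low, high) tuples.
reported = [3, (1, 4), (2, 10), 6, (5, 20)]
print(summarize(reported))
```

Entering an exact count as both ends of the range lets every study contribute to both medians, so studies that reported precise counts are not dropped from the analysis.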



The data regarding median number of generalization probes 
indicate that, on the average, the greatest number was conducted 
during training, and the smallest number of probes was conducted 
following training. It is important to determine that skills or parts 
of skills are generalizing as they are being learned, but it is also
of importance to investigate the maintenance of skills in
generalization situations following training. Therefore, although
post-training generalization probes may extend the length of time
required for the completion of investigations, it is recommended
that, where possible, at least some generalization probes be 
conducted over a period of several weeks following training. 



The number of generalization probes required during and 
following instruction may be determined in part by the need to
demonstrate the student's consistent performance in the
generalization setting. Multiple, consecutive probes may be
necessary in cases where the learner must perform the new skill
with a high degree of reliability. To illustrate, the generalization
of independent street-crossing skills would necessitate multiple,
consecutive demonstrations of successful generalization before the
student would be allowed to cross streets unobserved.

Table 2-3

Point(s) of                        Before Training   During Training   After Training    # of
Administration                     Range   Medians   Range   Medians   Range   Medians   Studies

Before, during, & after training     ?      3-7      1-39     6-17     1-20     3-3       12
Before & after training            1-37      ?        0        --      3-4      --         6*
Before & during training           1-24     4-10     1-124    6-16      0       --        11
During training only                 0       --      6-9       --       0       --         2
After training only                  0       --       0        --       ?       2-3       12
During & after training              0       --      3-4       --       1       --         1
Not reported                        --       --       --       --      --       --         3

*Seven studies reported pre- and post-training probes, but in one study, the number of probes was not reported.
? = entry illegible in the source copy; -- = no entry.

Conclusion



A review of the generalization assessment procedures utilized in a 
sample of studies from 1970-1975 and 1980-1985 indicated that 
there has been a recent increase in the measurement of 
generalization across settings. This increase may be due to the 
provision of training in skills within a variety of functional task 
areas that may be required by learners in diverse situations.

There were no dramatic differences found between the two time 
periods in terms of who assessed generalization, or whether the 
observations of the learner's behavior in the generalization setting 
were overt or covert. The fact that trainers were frequently used
as solicitors and observers of the target behavior during both
5-year periods suggests that researchers often failed to consider the
possible effects of using trainers in the generalization setting.

Over the past decade, a relative increase was noted in the
measurement of generalization following training only. Given that
data obtained only from post-training assessments provide weak
support for assertions of generalized performance, the observed 
increase in such measurement is alarming. 

Comprehensive assessments of generalization may be expensive
in terms of the time and participants required, but without the
examination of generalization across relevant dimensions in the
natural environment, research may result in findings of limited
educational utility. It was, therefore, suggested that skill
generalization be assessed across settings, across people, and
across stimuli, both separately and simultaneously. The person or
persons chosen to solicit and/or observe the learner's behavior
should be whoever would naturally be present in the
generalization setting. Ideally, the recording of subject responses
would be accomplished in a covert or unobtrusive manner. In
addition, generalization probes should be conducted before,
during, and after training to ensure that the target behavior has
generalized and is being maintained in nontraining situations.
Finally, the number of probes necessary may be determined by
the need to demonstrate consistent performance in the
generalization setting.

There are still many questions concerning the measurement of 
generalization that warrant further investigation. How many
examples of generalization across settings, across individuals, and
across stimuli are sufficient to demonstrate the learner's
successful use of a new skill? It has been suggested that
generalization be assessed across all stimulus and response
variations which will be encountered in a defined "universe"
(Horner, Sprague, & Wilcox, 1982). For some tasks, this is a
relatively simple procedure; for other tasks, however, assessment
across all possible variations may require considerable time and
effort.

The practical logistics of assessing generalization in the natural 
environment deserve additional attention. Since covert 
observation of the learner is frequently very difficult or 
impossible, unobtrusive ways of measuring target behaviors are
needed. Other issues related to the observation of skills in the
generalization setting include the effects of using the same
observers across multiple probes and the results of using the
trainer as an observer, even in cases where the observation is
covert. In addition, the procedural and scoring reliability of
persons who are not trained as solicitors and/or observers may be
questionable. Recent research indicated that some parents
conducting generalization probes in home settings were unreliable
in scoring responses, in administering procedures, or both
(Kayser, Billingsley, & Neel, 1986).

The scheduling of generalization probes during the training phase
could have an effect on student performance in the generalization 
setting. Research is needed on the effects of conducting probes 
before versus after training sessions. 

Finally, although not a topic of this review, it should be noted that
the issue of reinforcement in the generalization setting has not
been resolved in the experimental literature. If students have not
been introduced to natural maintaining contingencies, should the
reinforcement provided in training also be provided in
nontraining situations? If so, does the provision of such
reinforcement constitute training, rather than generalization,
conditions?

Extending Research Findings:
The Role of Staff Development
and Evaluation



Valerie Lynch 
Frances McCarty 

In education, the findings of basic and even applied research
come from settings devoted to research and with greater structure
than the usual public school setting. Promising new educational
techniques are frequently disseminated to potential consumers,
education professionals, with little understanding of their
feasibility, viability, impact, and costs in the public school setting.

Application or impact studies offer researchers an opportunity to 
further test their findings in school settings in order to determine 
if new methods or procedures (a) can be used by staff, (b)
produce results similar to original research findings, and (c) are
cost effective. Two elements characterize application studies.
First, the researcher relies on methods typically used to introduce
changes or innovations into school systems. This translates into
the use of staff development as a means of conveying a new
technique to school personnel. Second, the investigator minimizes
the intrusiveness of the research on the time commitments and
activities of the teacher.

In addition to the usual research concerns, the investigator
conducting an application study must attend closely to (a) the
design and implementation of a staff development program which
successfully conveys to teachers the method or procedure being
investigated, and (b) the design and implementation of a
comprehensive plan of evaluation which addresses questions of
usability, impact, and cost effectiveness of the technique under
study. In this paper we attempt to provide information about staff
development and evaluation which will benefit the researcher
interested in application studies.



Conducting application studies in the public schools requires that 
researchers carefully consider how to effect changes in teacher 
behavior. In other words, researchers must address the question 
of how to alter the behavior of public school staff so they can 
successfully implement techniques identified through basic or
applied research. An understanding of what is known about staff
development can assist the investigator in successfully
transferring a promising method or procedure into the hands of
teachers.

What is Staff Development?

In education, the term "staff development" is frequently used
interchangeably with "inservice training" and "inservice education"
by authors and practitioners alike. Although some authors would
take exception to this practice (Feiman, 1981; Wade, 1984), these
terms will be considered synonymous throughout this paper.

In order to plan staff development activities which result in the
successful implementation of practices and methods in schools, it
is important to have a clear understanding of what staff
development is and its relation to the needs of the researcher.
Definitions by various authors provide a basis for such an
understanding. Four elements are frequently included in these
definitions: (a) purpose; (b) approach; (c) beneficiaries; and (d)
context.

Purpose. Personal growth of individuals within education,
professional growth of school employees, organizational growth
or change, and social change have all been cited as purposes of
inservice education. To design inservice training and evaluate its
impact, it is critical to clearly articulate its purpose (Fox, 1981).

The most commonly identified reason for staff development is the
professional growth of educators (Brookfield, 1981;
Dillon-Peterson, 1981; Edelfelt, 1981; Rude, 1978). Preservice training
cannot be expected to equip teachers for the myriad of demands
and changes that await them during their professional careers.
Therefore, continuing professional development must occur for
teachers to successfully fill their complex professional roles and
meet unanticipated changes produced through technological,
political, and social shifts which influence education. Inservice
education designed to produce professional growth must address
goals and objectives which are intrinsic to those who participate
(Fox, 1981).

Although personal growth of educators has been defined as one 
purpose of staff development (Brookfield, 1981; Dillon-Peterson,
1981), it cannot be the sole purpose of staff development. Only
when compatible with needs for professional growth can personal
development be considered a viable reason for inservice
education.

Organizational change or growth has recently received much 
attention as a focus for inservice education in connection with 
calls for school improvement. Dillon-Peterson (1981) summarizes 
the relationship between staff development and organizational 
change by stating: "Staff development and organization 
development are a gestalt of school improvement; both are 
necessary for maximum growth and effective change. They are 
complementary human processes, inextricably complex" (pp. 
2-3). If organizational improvement or change is a desired 
outcome, the goals of staff development will be intrinsic to the
organization and may be extrinsic to the individual professional
goals of the educators involved (Fox, 1981).

Instances of social change implemented in educational settings, at
least in part through staff development, include bilingual
education, desegregation, and education for the handicapped. If
social policy implementation is desired, then goals for staff
development need only reflect the legal goals of society and not
necessarily the goals of the organization or individual
professionals (Fox, 1981).

Approach. It is generally agreed that staff development is a
systematic process to achieve specific changes or purposes
(Brookfield, 1981; Dillon-Peterson, 1981; Harrison, 1980). There
are, however, variations in the definition of steps within the
process.

Berman and McLaughlin (1978), Edge and Fink (1978),
Ehrenberg and Brandt (1976), and Wood, Thompson, and Russell
(1981) have all described different approaches to staff
development. The common elements in their descriptions are (a)
that inservice education is systematic, following an order or
sequence, and (b) that the process of inservice education is
cyclical: completion of the steps within the process reinitiates the
process.

Beneficiaries. The direct beneficiaries of staff development are 
obviously those individuals who participate in such activities. The 
ultimate beneficiaries of staff development, however, are students.
Benefits are improvement in students' lives (Feiman, 1981),
achievement of student outcomes (Ehrenberg & Brandt, 1976),
and impact on student achievement (Rude, 1978; Wood &
Thompson, 1980).

Context. Staff development does not take place in a vacuum. Its
context is defined by both its purposes and beneficiaries. It is the
organization (i.e., school system) or a subsystem (e.g., an
individual school, department) which provides the backdrop for
staff development and whose clients and goals shape inservice
education (Brookfield, 1980; Dillon-Peterson, 1980).

Implications. Staff development for most researchers concerned
with evaluating and validating educational methods and
procedures might well be defined as a systematic process which,
in the context of the needs and goals of the school, promotes the
professional growth of staff and ultimately results in benefits for
students. This definition has several implications for the design of
application studies:

1. Staff development for application studies must follow a
systematic process.

2. The goals and objectives of staff development must be
stated in terms of changes on the part of participating
educators: attitudinal, cognitive, and/or performance
outcomes.

3. The goals and objectives of staff development should reflect
the needs for professional development of those individuals
participating.

4. Because the context of staff development is the school
system, it is important to ensure that the goals of staff
development are not in conflict with the needs and goals of
the organization.

5. Evaluation of staff development should include
determination of the degree to which professionals change
and the impact of such changes on students.



Staff Development Practices 

Application studies, by design, place the researcher in the role of 
spectator. The researcher must sit on the sidelines of the
classroom observing whether teachers use the method or
procedure to which they've been introduced, the fluency with
which teachers use it, and the degree to which students change as
a result of the use of the method or procedure. The major "point
of control" for the researcher conducting an application study is
the introduction of the method or procedure to teachers; in
other words, the inservice training of teachers. In order to design
staff development activities which produce desired changes in 
teachers (e.g., attitudes, learning, and/or performance), the 
researcher should be acquainted with current best practices in 
inservice education. 

The information presented below relies heavily on three sources
which provide summaries of research on staff development in
education. Lawrence (1974) reviewed 97 studies and evaluation
reports published between 1962 and 1974 to determine the
characteristics of successful inservice education programs. In
1980, Harrison reviewed 47 studies of staff development
conducted between 1969 and 1979, quantified the results of these
studies, and synthesized the findings. This process, known as
meta analysis, was also used by Wade (1984) in her analysis of
91 studies published between 1968 and 1983.

Both Harrison (1980) and Wade (1984) use calculations of effect
size to describe how dependent variables were affected by
independent variables. To determine effect size, the mean
difference between treated and control groups is divided by the
standard deviation of the control group or some approximation of
this measure.
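
The effect size calculation described above can be illustrated with a short Python sketch. It is offered only as an example of the general formula (mean difference divided by the control group's standard deviation, a form commonly called Glass's delta), not as a reconstruction of the computations in Harrison (1980) or Wade (1984). The function name effect_size, the use of the sample (n - 1) standard deviation, and the scores themselves are assumptions made for illustration.

```python
from statistics import mean, stdev


def effect_size(treated, control):
    """Mean difference between groups divided by the control group's
    standard deviation.

    Assumption: the sample (n - 1) standard deviation is used; the
    source text does not say which form the reviewers applied.
    """
    return (mean(treated) - mean(control)) / stdev(control)


# Hypothetical post-training scores (invented for this example).
treated = [14, 16, 15, 17, 18]   # teachers who received inservice training
control = [12, 13, 11, 14, 15]   # teachers who did not

print(round(effect_size(treated, control), 2))
```

An effect size of 1.0 would mean that the treated group's mean lies one control-group standard deviation above the control group's mean, which is what allows results from studies with different measures to be compared on a common scale.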

In their meta analyses, Harrison (1980) and Wade (1984) each
identified four dependent variable classifications. Wade (p. 190)
uses the following levels of evaluation as dependent variables:

1. Reaction: Measures of how the participants feel about the
staff development activities, usually subjective.

2. Learning: Objective and quantitative measures that assess
how much a participant has learned as a result of inservice
activities.

3. Behavior: Objective measures that document whether or
not participants change their behavior as a result of a staff
development intervention.

4. Results: Objectively determining the effects of staff
development on students of participating teachers or on the
working environment.

Harrison (1980) classified dependent variables as affective,
cognitive, performance, and consequence. His definition of
cognitive variables matches Wade's (1984) definition of learning
level variables, and performance variables match the definition of
behavior by Wade. Harrison's definition of consequence variables
is similar to Wade's definition of results, but includes only
measures of student performance (e.g., tests of cognitive
knowledge and attitude scales) and not effects on the working
environment. Harrison's definition of affective dependent
variables has no analogue in Wade's classification. Affective
variables are related to "changes in interest, attitudes, and values,
and the development of appreciations" (Harrison, 1984, p. 58) by
participants in staff development activities.

Authors vary in the labels they attach to staff development
practices and the categories they use to summarize their findings.
For the purposes of this paper, we have tried to avoid jargon in
labeling practices and used the following categories to organize
information about staff development: arrangements, initiation,
instructor, participant characteristics, planning and management,
and design.

Arrangements. Both on-site, within the school, and off-site
inservice training appear to produce professional growth in
teachers. However, on-site inservice education produced greater
positive effects on teacher attitudinal outcomes (Harrison, 1980;
Lawrence, 1974), teacher performance or behavior outcomes
(Harrison, 1980; Lawrence, 1974; Wade, 1984), and cognitive or
learning outcomes by teachers (Harrison, 1980; Wade, 1984).

Draba (1975) has called for limiting the number of participants in
inservice training, while Berman and McLaughlin (1978) have
identified a need for a critical mass of participants. Harrison's
(1980) and Wade's (1984) findings are inconclusive with respect
to the number of participants to include in staff development
activities. Although not statistically significant, Wade's data
indicate that, in general, larger groups (over 20) produce slightly
higher positive effects.

The time at which staff development activities are scheduled does
not produce significant differences in effect size. However, when
Wade (1984) analyzed the relation of schedules to evaluation
measures, she found some statistically significant results. Reaction
effects (i.e., the reaction of participants to training activities) were
most positively influenced by training on weekends, evenings, or
a combination of times. Weekends and combination schedules
produced the most positive effects on participant learning.
Participant behaviors were most positively affected by weekend
training and training during the school day.

Neither the total duration of training in hours nor the length of
training over a period of days was found to significantly
influence effect size (i.e., measures of effectiveness based on
statistics reported in each reviewed study in a meta analysis).
Both Harrison (1980) and Wade (1984) found slightly higher
effect sizes for short-term training (i.e., six months or less) as
compared to long-term training (i.e., over six months). In terms of
the number of hours of training, Wade found that longer treatments
were associated with a general lessening of effect size.
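
As an illustration of how an effect size of this kind might be
computed, the sketch below uses one common convention: the
difference between treatment and control group means divided by
the control group standard deviation. That this convention applies
here is an assumption; the formulas used by Harrison and Wade are
not given in this section. All function names, scores, and
category values are hypothetical.

```python
from statistics import mean, stdev


def standardized_mean_difference(treatment, control):
    """(treatment mean - control mean) / control-group sample SD.

    Assumed convention for illustration only; the reviewed meta
    analyses may have used a different formula.
    """
    return (mean(treatment) - mean(control)) / stdev(control)


def mean_effect_size_by_category(effect_sizes):
    """Average the per-study effect sizes within each category."""
    return {category: mean(values) for category, values in effect_sizes.items()}


# Hypothetical post-training scores for one study.
treatment_scores = [14, 16, 18, 20, 22]
control_scores = [10, 12, 14, 16, 18]
es = standardized_mean_difference(treatment_scores, control_scores)

# Hypothetical per-study effect sizes grouped by training duration.
by_duration = mean_effect_size_by_category(
    {"short_term": [0.5, 0.7], "long_term": [0.3, 0.5]}
)
```

Comparing the category means in `by_duration` mirrors, in a very
simplified way, the kind of short-term versus long-term comparison
described above; it says nothing about statistical significance.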

Initiation. When examining the initiators of staff development
activities, both Harrison (1980) and Wade (1984) found that the
majority of programs were outside-initiated (i.e., originated by
state/federal government, university researcher, etc.) as opposed
to initiation within the school by participants, an administrator, or
a supervisor. Harrison found a higher effect size for
inside-initiated programs, although only four of a total of 65
programs fell into that category. Wade, on the other hand, found that
outside-initiated programs (460 cases) produced almost twice the
effect size of inside-initiated programs (174 cases). She also
found some statistically significant results when analyzing the
source of program origination in relation to evaluation measures.
The reaction of participants was most favorable when staff
development activities were initiated by the state or federal
government. Staff development originated by university
researchers produced the greatest positive effects on participant
behaviors. An administrator or supervisor originating a staff
development program produced the greatest positive effects on
participant and/or student results, with university researchers
producing the next highest positive results.

Instructor. Supervisory staff (12 cases), college personnel (36
cases), and teachers (5 cases) as instructors of staff development
activities were found to be more effective than state department
staff (1 case), consultants (5 cases), and intermediate school
district staff (2 cases) by Harrison (1980). Wade (1984) included
the category of self-instruction in her meta analysis and found this
to produce the highest positive effect size, followed by
supervisory staff and college personnel. Teachers (57 cases) as
instructors produced only a small positive effect size.

Participant characteristics. A study by the Rand Corporation of
federally funded programs aimed at creating innovation or change
in educational practices within school districts (Berman &
McLaughlin, 1978) concluded that it is easier to implement and
continue changes at the elementary level than at the secondary
level. This finding is corroborated by Wade (1984). Wade also
examined the effects of combined participant groups of both
elementary and secondary personnel. This category produced the
greatest effect size and, when analyzed by outcome measures,
produced statistically significant results in terms of participant
reaction, learning, and behavior. Elementary school groups
produced statistically significant effects on participant and/or
student results.

Although staff development activities which are voluntary rather
than obligatory have been called for (Draba, 1975), the research
findings do not consistently support this. Harrison's analysis
(1980) resulted in higher positive effects for obligatory
participation, while Wade's data (1984) show a higher effect size
for voluntary participation in inservice education.

Increased status appears to be the most effective incentive for
inservice training, followed by college credit (Wade, 1984).
McLaughlin and Berman (1977) found that release time from
normal work activities, rather than monetary incentives, was used
by school districts which were more effective in their inservice
training and in maintaining change. Wade's analysis also
indicates that release time produces greater effects than pay
incentives. In terms of participant reactions to staff development
activities, the incentives of status, college credit, and release time
significantly enhance effect sizes. Behavior effect sizes are
significantly influenced by status and college credit as incentives.

Planning and management. Participant involvement in the
planning and management of staff development activities has
frequently been suggested (Draba, 1975; Hutson, 1981; Wood &
Thompson, 1980) and appears to be supported by research.
Teacher participation in project decision-making was found to
strongly correlate with the effective implementation and
continuation of innovative projects in the Rand study (Berman &
McLaughlin, 1978). Lawrence (1974) found that school-based
inservice education programs conducted by personnel from
outside the district were more effective when teachers or
administrators were involved as helpers and planners than those
programs in which they provided no assistance.

Wade (1984) analyzed the relationship between trainer and
participants in staff development activities. According to her
analysis, trainers who assume a superordinate role (i.e., in a
position of authority or expertise) in relation to participants will
produce a higher, statistically significant mean effect size than if a
collegial relationship is assumed. Reaction, learning, and behavior
effects were all more positively influenced when the trainer was
in a superordinate role. When participant and/or student results
are desired, a collegial relationship between the trainer and
participants is more effective.

Harrison (1980), Lawrence (1974), and Wade (1984) all found
that active roles (i.e., generating ideas, behaviors, materials) for
participants of inservice education appear to produce greater
positive effects than passive roles (i.e., accepting ideas or
behavioral prescriptions). Although her findings were not
statistically significant, Wade's results suggest that an active
participant role produces larger effect sizes for learning and
behavior outcomes, while a passive or receptive role produces
larger effect sizes for participant reactions to training.

Wade (1984) found that staff development which has an 
instructional focus on affective techniques produces significantly
greater effect sizes for participant reactions to training, participant 
learning, and participant and/or student results. Participant 
behaviors are significantly influenced when the instructional focus 
of inservice education is on improving general teaching rather 
than on improvement of a specific subject or affective techniques. 

Goals of inservice training programs can be viewed as common
(i.e., participants working toward a common end/goal) or
individual/personal (i.e., participants working toward different
goals based on their individual needs). Both Harrison (1980) and
Wade (1984) found that common goals produced a larger total
effect size than individual goals, although this finding was not
statistically significant.

Design. The designer of inservice education must decide whether 
activities included in training will be common to all participants, 
individualized for different teachers, or a combination of these 
approaches. When common and individualized approaches are
compared, the results are conflicting. Lawrence (1974) found that
individualized activities were more likely to achieve the objectives
of training, while Wade (1984) and Harrison (1980) found that
common activities produced a higher mean effect size. Yet in
Wade's meta analysis, common activities produced a much higher
effect size on behavioral outcomes than did individualized
activities, although the difference was not statistically significant.

Harrison was the only author to examine the effects of a
combination of approaches. Although caution must be exercised
in drawing any conclusions because of the small number of
studies (7 cases), combined activities produced a higher effect
size than either of the other two approaches.

Numerous instructional methods for staff development are 
available to the inservice educator. Fortunately, a number of 
authors have examined the effectiveness of different methods of
instruction and some commonality exists in their findings.

Joyce and Showers (1980) reviewed over 200 studies of training 
methods both alone and in combination, and concluded that 
effective inservice education results when several or all of the
following methods are combined: theory presentation/skill
description; modeling/demonstration; practice; feedback; and
coaching (i.e., classroom-based, hands-on assistance provided to
facilitate the transfer of skills/strategies to the classroom). More
specifically, they found that the combination of all five methods
produced mastery of new ways of teaching, while only modeling,
practice, and feedback were required to produce improvement in
or fine tuning of skills.

In a second review of training studies, Joyce and Showers (1983)
examined instructional methods which produced horizontal
transfer of skills (i.e., situations in which skills can be directly
applied from the training setting into the workplace) and vertical
transfer of skills (i.e., situations in which skill adaptation must
occur in order to successfully apply skills acquired in the training
setting to the workplace). The most frequently combined
instructional methods to achieve horizontal skill transfer were
theory, practice, and feedback. Theory, demonstration, practice,
feedback, and coaching were most frequently combined to
produce vertical transfer.

Lawrence (1974) found that inservice education programs using
demonstration, practice, feedback, and books as instructional
methods were likely to achieve a high degree of success.
Observation, demonstration, lecture, and books (with only five
cases) produced the highest positive effects of the 13 instructional
methods investigated by Harrison (1980). In Wade's (1984)
examination of 15 different instructional methods, observation,
micro teaching, video/audio, and practice proved significantly
more effective than other instructional methods, while lectures,
games, discussions, and guided field trips produced significantly
lower effect sizes than other methods.

Authors who investigated the effects of practice as an 
instructional method found it to be very effective. The 
effectiveness of demonstration as an instructional method is also a 
consistent finding, if one considers observation as a form of
demonstration. 



Evaluation 

Evaluation as an established field of applied social research has 
grown rapidly over the past 20 years (Raizen & Rossi, 1981), and 
the importance of evaluation research in education is widely 
recognized (Bernstein & Freeman, 1975; Gersten, Carnine, &
Williams, 1982; Gersten & Hauser, 1984; Raizen & Rossi, 1981;
White, 1984; Williams & Elmore, 1976). The purpose of
evaluation research is to measure the effects of a program or
intervention; that is, to what degree have the changes intended by
the intervention been achieved and to what extent can these 
changes be ascribed to the intervention (Raizen & Rossi, 1981; 
Weiss, 1972)? 

The term "comprehensive evaluation" refers to studies that
include three components: monitoring, impact, and ex post facto
cost-benefit or cost-effectiveness analyses (Rossi, Freeman, &
Wright, 1979). A comprehensive evaluation provides data to
determine whether the intervention was carried out as planned,
whether the intervention resulted in changes in the intended
direction, and what the intervention costs were in relation to its
benefits.

Formative Evaluation: Monitoring Program
Implementation and Service Delivery

The literature is replete with admonitions that attention be paid to
implementation in program evaluation and with reasons why
programs should be monitored. First, monitoring is needed for
accountability purposes (e.g., who is getting what and how; Rossi,
Freeman, & Wright, 1979). Second, monitoring evaluations are
generally prerequisites to effective impact assessments, since the
failure of programs is often due to faulty performance or
nonimplementation rather than ineffective interventions (Rossi &
Wright, 1977). Third, monitoring information may be a
supplement to, or the sole basis for, deciding whether to continue
programs (Carlo, 1977; Roos, Roos, Nicol, & Johnson, 1978).

Monitoring the delivery of services to evaluate the degree of
program implementation is undertaken for a number of purposes.
A large proportion of programs that fail to show impact are really
failures to deliver the interventions in the manner specified. There
are three potential failures: (a) none (or not enough) of the
intervention is implemented, (b) the wrong intervention is
implemented, or (c) the intervention is unstandardized,
uncontrolled, or varies across implementation. In each instance,
monitoring the delivery of services and identifying
discrepancies is essential (Rossi, Freeman, & Wright, 1979). An
intervention may perform poorly at a given school or site, but
without formative evaluation it is unclear whether the
performance is due to problems inherent in the intervention or
problems in the way the intervention was implemented at that
particular site (House, Glass, McLean, & Decker, 1978; Louck &
Hall, 1977; Proper, 1980). As a growing body of evidence
suggests that educational innovations are rarely, if ever,
implemented exactly as planned, there is a need to collect actual
implementation data (Gersten & Hauser, 1984).

Impact Evaluation 

The extent to which an intervention is used depends on a number
of factors. One factor that is critical is evidence of effectiveness;
that is, the program outcomes and the conditions under which
implementation occurs to produce those outcomes (Wang &
Ellett, 1981). Impact evaluation is the assessment of the extent to
which an intervention results in desired changes in the target
population. Questions that need to be asked in an impact
evaluation include: (a) Is the intervention effective in reaching the
intended goals, (b) can results be explained by other variables
which are not part of the intervention, and (c) has the intervention
resulted in unintended effects? For an intervention to have
impact, it must result in movement toward desired objectives.

When conducting an impact evaluation, there must be a plan for
the collection of data. The data collection plan should allow the
investigator to demonstrate that the outcomes that occurred were
the result of the intervention, and to reject any competing
explanations or confounding effects. Therefore, impact
evaluations need to be undertaken as systematically and
rigorously as possible in order to document the causal linkages
between intervention inputs and program outcomes.

The critical issue in impact evaluation is whether or not a
program has produced significantly more of an effect than would
have occurred without the intervention. Two prerequisites to an
effective impact evaluation are having: (a) goals that are clearly
defined so that it is possible to measure goal attainment, and (b)
evidence that the intervention is sufficiently implemented.
Initiating impact evaluations requires the identification and
explication of one or more outcome measures that both reflect the
intervention goals and are sensitive enough to allow
measurement of change if the intervention is effective.

Impact evaluation for pupils with severe/profound handicaps.
The most common method of impact assessment has been the
comparison between large-n experimental and control groups
(Rossi, Freeman, & Wright, 1979; White, 1984).

For program evaluation in the area of the severely handicapped,
group comparison designs are generally not feasible. As White
(1984) points out,

The greatest problem in the application of "large-n"
approaches to the evaluation of programs serving
handicapped populations lies in the simple fact that the total
number of such children in any given program unit is likely
to be small. In a hypothetical district of 20,000, it is very
unlikely that two or three severely/profoundly handicapped
individuals could be found who matched each other
reasonably well on even the most obvious educationally
relevant variables. [That problem] virtually nullifies the
possibility of utilizing the vast majority of traditional
strategies for the evaluation of programs serving the
severely handicapped population.

An alternative to group comparison designs that has been
suggested by White and others (Campbell & Stanley, 1966;
Fitz-Gibbon & Morris, 1978; Hersen & Barlow, 1976) is single subject
time series analysis. A series of measures, usually given at equal
intervals before and after the intervention, is called a time series.
A series of measures systematically taken before a program starts
can actually eliminate the need for a control group. The single
subject time series design uses the students in the program as their
own control group, what White calls a "perfect match." Richard
Jones (1979) advocated using single subject designs for formative
evaluation of individual program components and then following
up with group designs to evaluate the overall effectiveness of a
program in helping groups of students. (For a detailed discussion
of the pros and cons of different research evaluation designs, see
White, 1984.) Whatever the particular research design, it is
important for educators to use evaluation to determine how much
students learn and whether this learning can be linked to a
particular educational approach (Gersten & Hauser, 1984).
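
A minimal sketch of how a single subject time series might be
summarized follows. It compares the level (mean) and trend
(least-squares slope) of measures taken at equal intervals before
and after an intervention, so that the student's own baseline
serves as the comparison. The summary statistics chosen, the
function names, and the data are all assumptions made for
illustration; they are not drawn from White (1984) or the other
sources cited.

```python
from statistics import mean


def slope(values):
    """Least-squares slope against equally spaced sessions 0, 1, 2, ..."""
    n = len(values)
    x_bar = (n - 1) / 2
    y_bar = mean(values)
    numerator = sum((i - x_bar) * (y - y_bar) for i, y in enumerate(values))
    denominator = sum((i - x_bar) ** 2 for i in range(n))
    return numerator / denominator


def summarize_phases(baseline, intervention):
    """Compare level and trend of the baseline and intervention phases."""
    return {
        "baseline_mean": mean(baseline),
        "intervention_mean": mean(intervention),
        "level_change": mean(intervention) - mean(baseline),
        "baseline_slope": slope(baseline),
        "intervention_slope": slope(intervention),
    }


# Hypothetical data: percent of task steps performed correctly per session.
baseline = [20, 25, 20, 25, 20]
intervention = [40, 50, 60, 70, 80]
summary = summarize_phases(baseline, intervention)
```

A flat baseline followed by a higher, rising intervention phase is
the pattern such a summary is meant to expose; whether the change
can be attributed to the intervention still depends on the design
considerations discussed above.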

Cost Analysis Procedures in Education

As the resources now available to school districts are scarce,
decisions on alternative uses of limited resources need to be
made. In the field of education, therefore, cost factors are now
analyzed when making program decisions. Cost factors in both
new educational programs and proposed changes in existing
programs involve looking at several categories of cost over time
(Haller, 1974; Levin, 1983; Sorensen & Binner, 1979). Research
and development costs are those resources required to develop a
program sufficiently for introduction into the system. Investment
costs are those necessary to implement the program (e.g., special
equipment, training, etc.). Finally, there are operating costs, those
recurring costs required to operate a program over time.

The first step in deciding on a model for cost analysis is to decide
what model for evaluation would be most appropriate. Two
models are used predominantly in the social sciences: cost
benefit and cost effectiveness approaches (Alkin, 1970; Barkdoll,
1980; Haller, 1979; Sweeny & Blaschke, 1980). In cost benefit
analysis, alternatives are evaluated with both costs and
benefits measured in monetary terms. It requires deciding the
value of such things, and assigning a dollar value to educational
outcomes is a subjective process at best (Weinrott et al., 1983).
Cost effectiveness analysis involves evaluation of alternatives
according to both their costs and their effects with regard to
producing some outcome or set of outcomes (Schnelle et al.,
1979). Under cost effectiveness analysis, both the costs and
effects of alternatives are taken into account in evaluating
programs with similar goals. It is assumed that (a) only programs
with similar or identical goals can be compared and (b) a
common measure of effectiveness can be used to assess them
(Alkin, 1970; Schnelle et al., 1979; Sorensen & Binner, 1979;
Weinrott, Jones, & Howard, 1983).
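
The comparison described above can be sketched as follows, under
stated assumptions: total cost is taken as the simple sum of
research and development, investment, and operating costs over a
fixed number of years (no discounting), and effectiveness is a
single common measure shared by both programs. The class, the
function, the programs, and every figure are hypothetical and are
not taken from the sources cited.

```python
from dataclasses import dataclass


@dataclass
class ProgramCosts:
    """Cost categories named in the text; all figures are hypothetical."""
    research_and_development: float
    investment: float
    operating_per_year: float

    def total(self, years: int) -> float:
        # Operating costs recur each year; no discounting is applied (assumption).
        return (self.research_and_development + self.investment
                + self.operating_per_year * years)


def cost_per_unit_of_effect(total_cost: float, effect: float) -> float:
    """Cost effectiveness ratio: cost per unit of the common effect measure."""
    if effect <= 0:
        raise ValueError("effect must be positive to form a ratio")
    return total_cost / effect


# Two hypothetical programs with the same goal, run for three years.
program_a = ProgramCosts(5000, 3000, 2000)
program_b = ProgramCosts(2000, 4000, 3000)
# Hypothetical common effectiveness measure (e.g., skills mastered).
effects = {"A": 20, "B": 30}
ratios = {
    "A": cost_per_unit_of_effect(program_a.total(years=3), effects["A"]),
    "B": cost_per_unit_of_effect(program_b.total(years=3), effects["B"]),
}
more_cost_effective = min(ratios, key=ratios.get)
```

A lower ratio indicates less cost per unit of outcome. The ratio is
only meaningful when the two assumptions named in the text hold:
the programs share goals and are assessed on a common measure.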



Summary 

For the educational researcher, application studies are a logical
and critical extension of basic and applied research. They provide
information on the general utility, feasibility, cost effectiveness,
and potential adaptations of educational methods and procedures
prior to their broad dissemination. To conduct application studies,
the researcher must consider how to design: (a) staff development
activities which will enable educators in schools to implement the
methods and procedures of basic and applied educational
research, and (b) a comprehensive evaluation plan which provides
information on the degree of implementation, the level of impact,
and the relative costs of the technique under study.